Phrase-Based Alignment Models for Statistical Machine Translation

Authors

  • Jesús Tomás
  • Jaime Lloret Mauri
  • Francisco Casacuberta
Abstract

The first pattern recognition approaches to machine translation were based on single-word models. However, these models have an important deficiency: they do not take contextual information into account when making the translation decision. The phrase-based approach translates a multiword source sequence into a multiword target sequence, instead of a single source word into a single target word. We present different methods to train the parameters of this kind of model. In the evaluation phase of this approach, we obtained interesting results in comparison with other statistical models.
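The contrast between single-word and phrase-based translation can be illustrated with a minimal sketch. It is not the authors' model or training method: the Spanish-English phrase table is invented for illustration, the decoder is a simple greedy monotone lookup, and the names `PHRASE_TABLE` and `translate` are hypothetical.

```python
# Toy illustration only: this phrase table is invented, not taken from the paper.
PHRASE_TABLE = {
    ("por", "favor"): ("please",),
    ("buenos", "días"): ("good", "morning"),
    ("por",): ("for",),
    ("favor",): ("favour",),
    ("buenos",): ("good",),
    ("días",): ("days",),
}


def translate(source, table, max_len):
    """Greedy monotone translation (an assumed, simplified decoder).

    At each position, take the longest source segment of at most
    max_len words that appears in the table and emit its target side.
    """
    out = []
    i = 0
    while i < len(source):
        for n in range(min(max_len, len(source) - i), 0, -1):
            segment = tuple(source[i:i + n])
            if segment in table:
                out.extend(table[segment])
                i += n
                break
        else:
            # Unknown word: copy it through unchanged.
            out.append(source[i])
            i += 1
    return out


sentence = "buenos días por favor".split()
word_based = translate(sentence, PHRASE_TABLE, max_len=1)    # single-word units only
phrase_based = translate(sentence, PHRASE_TABLE, max_len=2)  # multiword units allowed
print(word_based)
print(phrase_based)
```

With `max_len=1` every word is translated in isolation, so the idiomatic pairs are lost; with `max_len=2` the multiword entries supply the local context that the abstract says single-word models lack.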

Similar resources

NUT-NTT statistical machine translation system for IWSLT 2005

In this paper, we present a novel distortion model for phrase-based statistical machine translation. Unlike the previous phrase distortion models whose role is to simply penalize nonmonotonic alignments [1, 2], the new model assigns the probability of relative position between two source language phrases aligned to the two adjacent target language phrases. The phrase translation probabilities an...

PESA: Phrase Pair Extraction as Sentence Splitting

Most statistical machine translation systems use phrase-to-phrase translations to capture local context information, leading to better lexical choice and more reliable local reordering. The quality of the phrase alignment is crucial to the quality of the resulting translations. Here, we propose a new phrase alignment method, not based on the Viterbi path of word alignment models. Phrase alignme...

BIA: a Discriminative Phrase Alignment Toolkit

In most statistical machine translation systems, bilingual segments are extracted via word alignment. However, word alignment is performed independently from the requirements of the machine translation task. Furthermore, although phrase-based translation models have replaced word-based translation models nearly ten years ago, word-based models are still widely used for word alignment. In this pape...

Translation Model Based Weighting for Phrase Extraction

Domain adaptation for statistical machine translation is the task of altering general models to improve performance on the test domain. In this work, we suggest several novel weighting schemes based on translation models for adapted phrase extraction. To calculate the weights, we first phrase align the general bilingual training data, then, using domain specific translation models, the aligned ...

Statistical machine translation with cascaded probabilistic transducers

Statistical machine translation is based on the idea of extracting information from bilingual corpora, which can be used to generate new translations. The current work combines aspects from example-based machine translation and from grammar-based approaches, esp. bilingual regular grammars, to develop a statistical translation system based on cascaded transducers. These transducers can be construc...

Adjunct Alignment in Translation Data with an Application to Phrase-Based Statistical Machine Translation

Enriching statistical models with linguistic knowledge has been a major concern in Machine Translation (MT). In monolingual data, adjuncts are optional constituents contributing secondarily to the meaning of a sentence. One can therefore hypothesize that this secondary status is preserved in translation, and thus that adjuncts may align consistently with their adjunct translations, suggesting t...

Publication date: 2005